Representative Objects: Concise Representations of Semistructured, Hierarchical Data

Authors

  • Svetlozar Nestorov
  • Jeffrey D. Ullman
  • Janet L. Wiener
  • Sudarshan S. Chawathe
Abstract

In this paper we introduce the representative object, which uncovers the inherent schema(s) in semistructured, hierarchical data sources and provides a concise description of the structure of the data. Semistructured data, unlike data stored in typical relational or object-oriented databases, does not have a fixed schema that is known in advance and stored separately from the data. With the rapid growth of the World Wide Web, semistructured hierarchical data sources are becoming widely available to the casual user. The lack of external schema information currently makes browsing and querying these data sources inefficient at best, and impossible at worst. We show how representative objects make schema discovery efficient and facilitate the generation of meaningful queries over the data.
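To make the idea of a concise structural description concrete, the sketch below collects the distinct label paths that occur in a small nested data set. It is only an illustration under stated assumptions: the function name `label_paths`, the sample `data`, and the choice of nested dictionaries and lists as a stand-in for semistructured data are all hypothetical, and this is not the construction defined in the paper.

```python
from typing import Any, Set, Tuple


def label_paths(obj: Any, prefix: Tuple[str, ...] = ()) -> Set[Tuple[str, ...]]:
    """Collect every distinct label path reachable from obj (hypothetical helper)."""
    paths: Set[Tuple[str, ...]] = set()
    if isinstance(obj, dict):
        for label, child in obj.items():
            path = prefix + (label,)
            paths.add(path)
            paths |= label_paths(child, path)
    elif isinstance(obj, list):
        # A list holds several subobjects that share the same label.
        for child in obj:
            paths |= label_paths(child, prefix)
    return paths


# Hypothetical data: two "restaurant" entries with different fields,
# as is typical when no fixed schema is enforced.
data = {
    "restaurant": [
        {"name": "A", "address": {"city": "X"}},
        {"name": "B", "phone": "555"},
    ]
}

summary = sorted(".".join(p) for p in label_paths(data))
for line in summary:
    print(line)
```

The printed list of paths is a small summary that a user could browse before writing a query, even though the two entries do not share the same fields.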

Similar resources

Integration of Heterogeneous Semistructured Data Models in the Canonical One

To provide for interoperability of heterogeneous information objects it is required to establish a global, uniform view of the underlying digital collections and services. An information model is needed which is able to express uniformly the structure and semantics of heterogeneous data collections as well as the available services. Usually the mediator's layer is introduced to provide the user...

Closed-set-based Discovery of Bases of Association Rules

The output of an association rule miner is often huge in practice. This is why several concise lossless representations have been proposed, such as the “essential” or “representative” rules. We revisit the algorithm given by Kryszkiewicz (Int. Symp. Intelligent Data Analysis 2001, Springer-Verlag LNCS 2189, 350–359) for mining representative rules. We show that its output is sometimes incomplet...

Closed Set Based Discovery of Representative Association Rules

The output of an association rule miner is often huge in practice. This is why several concise lossless representations have been proposed, such as the “essential” or “representative” rules. We revisit the algorithm given by Kryszkiewicz (Int. Symp. Intelligent Data Analysis 2001, Springer-Verlag LNCS 2189, 350–359) for mining representative rules. We show that its output is sometimes incomplet...

An Integrated Spatial Statistics Package for Map Data Analysis

In the analysis of geocoded statistical data, common practice has been to treat the data in isolation from its locational or spatial characteristics. This results in a potentially critical loss of the spatial information that is contained in the mapped representation of the statistical data but not in the application of aspatial statistical techniques such as cross-sectional regression. One ex...

Publication date: 1997